1 A new verification methodology for complex pipeline behavior
Kazuyoshi Kohno, Nobu Matsumoto 



June 2001 Proceedings of the 38th conference on Design automation 

Additional Information: full citation, abstract, references, citings, index terms



Full text available: pdf



A new test program generation tool, mVpGen, is developed for verifying pipeline design of 
microprocessors. The only inputs mVpGen requires are pipeline-behavior specifications; it 
automatically generates test cases at first from pipeline-behavior specifications and then 
automatically generates test programs corresponding to the test cases. Test programs for 
verifying complex pipeline behavior such as hazard and branch or hazard and exception, are 
generated. mVpGen has been integra ... 

2 MU6-G. A new design to achieve mainframe performance from a mini-sized computer
D. B.G. Edwards, A. E. Knowles, J. V. Woods 




May 1980 Proceedings of the 7th annual symposium on Computer Architecture 

Full text available: pdf



Additional Information: full citation, abstract, references, citings, index terms




MU6-G is a high performance machine useful for general or scientific applications. Its order 
code and architecture are designed to be sympathetic to the needs of the operating system 
and to both the compilation and execution of programs written in high level languages and 
to support a word size suitable for high precision scientific computations. Advanced 
technology, coupled with simplicity of design, is used to achieve a high and more readily 
predictable performance. Innovative features in ... 



3 Regular contributions: DSP architectures: past, present and futures
Edwin J. Tan, Wendi B. Heinzelman 

June 2003 ACM SIGARCH Computer Architecture News, volume 31 issue 3 
Full text available: pdf(1.27 MB) Additional Information: full citation, abstract, references

As far as the future of communication is concerned, we have seen that there is great 
demand for audio and video data to complement text. Digital signal processing (DSP) is the 
science that enables traditionally analog audio and video signals to be processed digitally 
for transmission, storage, reproduction and manipulation. In this paper, we will explain the 
various DSP architectures and its silicon implementation. We will also discuss the state-of- 
the art and examine the issues pertaining to pe ... 



4 XTREM: a power simulator for the Intel XScale® core

Gilberto Contreras, Margaret Martonosi, Jinzhan Peng, Roy Ju, Guei-Yuan Lueh 
June 2004 ACM SIGPLAN Notices, Proceedings of the 2004 ACM SIGPLAN/SIGBED
conference on Languages, compilers, and tools, volume 39 issue 7
Full text available: pdf(1.07 MB) Additional Information: full citation, abstract, references, index terms

Managing power concerns in microprocessors has become a pressing research problem across
the domains of computer architecture, CAD, and compilers. As a result, several 
parameterized cycle-level power simulators have been introduced. While these simulators 
can be quite useful for microarchitectural studies, their generality limits how accurate they 
can be for any one chip family. Furthermore, their hardware focus means that they do not 
explicitly enable studying the interaction of different softwa ... 

Keywords: Java, XORP, XScale, hardware performance counters, power measurements, 
power modeling 



5 FPGA-based sonar processing
Paul Graham, Brent Nelson 

March 1998 Proceedings of the 1998 ACM/SIGDA sixth international symposium on
Field programmable gate arrays

Full text available: pdf(1.21 MB) Additional Information: full citation, abstract, references, citings, index terms

This paper presents the application of time-delay sonar beamforming and discusses a multi- 
board FPGA system for performing several variations of this beamforming method in real- 
time for realistic sonar arrays. Additionally, we show that our proposed FPGA system has a 
six to twelve times performance advantage over an equivalent system created using 
currently available, high-performance DSPs designed for multiprocessing systems. This 
performance advantage is due to the simplicity of the core ... 

6 Pipelined architectures: MaRS: a macro-pipelined reconfigurable system
Nozar Tabrizi, Nader Bagherzadeh, Amir H. Kamalizad, Haitao Du 

April 2004 Proceedings of the first conference on computing frontiers on Computing 
frontiers 

Full text available: pdf(193.48 KB) Additional Information: full citation, abstract, references, index terms



We introduce MaRS, a reconfigurable, parallel computing engine with special emphasis on 
scalability, lending itself to the computation-/data-intensive multimedia data processing and 
wireless communication. Global communication between the processing elements (PEs) in 
MaRS is performed through a 2D-mesh deadlock-free network, avoiding any concerns due 
to non-scalable bus-based communication. Additionally, we have developed a second layer 
of inter-PE connection realized by distributed shared regis ... 

Keywords: 2D-mesh network, MIMD, computer graphics, multimedia, reconfigurable 
architectures, wireless communication 



7 MetaCore: an application specific DSP development system

Jin-Hyuk Yang, Byoung-Woon Kim, Sang-Jun Nam, Jang-Ho Cho, Sung-won Seo,
Chang-Ho Ryu, Young-Su Kwon, Dae-Hyun Lee, Jong-Yeol Lee, Jong-Sun Kim,
Hyun-Dhong Yoon, Jae-Yeol Kim, Kun-Moo Lee, Chan-Soo Hwang, In-Hyung Kim,
Jun-Sung Kim, Kwang-Il Park, Kyu-Ho Park, Yong-Hoon Lee, Seung-Hoon Hwang,
In-Cheol Park, Chong-Min Kyung
May 1998 Proceedings of the 35th annual conference on Design automation - Volume 
00 

Full text available: pdf, Publisher Site Additional Information: full citation, abstract, references, citings, index terms

This paper describes the MetaCore system which is an ASIP (Application-Specific Instruction 
set Processor) development system targeted for DSP applications. The goal of MetaCore 
system is to offer an efficient design methodology meeting specifications given as a 
combination of performance, cost and design turnaround time. MetaCore system consists of 
two major design stages: design exploration and design generation. In the design 
exploration stage, MetaCore system accepts a set of ... 
8 Area/delay estimation for digital signal processor cores
Yuichiro Miyaoka, Yoshiharu Kataoka, Nozomu Togawa, Masao Yanagisawa, Tatsuo Ohtsuki
January 2001 Proceedings of the 2001 conference on Asia South Pacific design
automation

Full text available: pdf(76.83 KB) Additional Information: full citation, abstract, references, index terms

Hardware/software partitioning is one of the key processes in a hardware/software 
cosynthesis system for digital signal processor cores. In hardware/software partitioning, 
area and delay estimation of a processor core plays an important role since the 
hardware/software partitioning process must determine which part of a processor core 
should be realized by hardware units and which part should be realized by a sequence of 
instructions based on execution time of an input application program a ... 

9 Spinach: a liberty-based simulator for programmable network interface architectures
Paul Willmann, Michael Brogioli, Vijay S. Pai 

June 2004 ACM SIGPLAN Notices, Proceedings of the 2004 ACM SIGPLAN/SIGBED
conference on Languages, compilers, and tools, Volume 39 issue 7
Full text available: pdf(336.99 KB) Additional Information: full citation, abstract, references, index terms

This paper presents Spinach, a new simulator toolset specifically designed to target 
programmable network interface architectures. Spinach models both system components 
that are common to all programmable environments (e.g., ALUs, control and data paths, 
registers, instruction processing) and components that are specific to the embedded 
systems and network interface environments (e.g., software-controlled scratchpad memory, 
hardware assists for DMA and medium access control). Spinach is built on ... 

Keywords: embedded systems, programmable network interfaces, simulation 



10 Building a robust software-based router using network processors
Tammo Spalink, Scott Karlin, Larry Peterson, Yitzchak Gottlieb 

October 2001 ACM SIGOPS Operating Systems Review, Proceedings of the eighteenth
ACM symposium on Operating systems principles, Volume 35 issue 5

Full text available: pdf(1.49 MB) Additional Information: full citation, abstract, references, citings, index terms

Recent efforts to add new services to the Internet have increased interest in software-based 
routers that are easy to extend and evolve. This paper describes our experiences using 
emerging network processors— in particular, the Intel IXP1200— to implement a router. We 
show it is possible to combine an IXP1200 development board and a PC to build an 
inexpensive router that forwards minimum-sized packets at a rate of 3.47Mpps. This is 
nearly an order of magnitude faster than existing pure PC-base ... 

11 Intelligent gaze-added interfaces
Dario D. Salvucci, John R. Anderson 

April 2000 Proceedings of the SIGCHI conference on Human factors in computing 
systems 

Full text available: pdf Additional Information: full citation, abstract, references, citings, index terms

We discuss a novel type of interface, the intelligent gaze-added interface, and describe the 
design and evaluation of a sample gaze-added operating-system interface. Gaze-added 
interfaces, like current gaze-based systems, allow users to execute commands using their 
eyes. However, while most gaze-based systems replace the functionality of other inputs 
with that of gaze, gaze-added interfaces simply add gaze functionality that the user can 
employ if and when desired. Intelligent gaze-added inte ... 

Keywords: eye movements, gaze-added interfaces, gaze-based interfaces, intelligent 
12 Petri Nets
James L. Peterson 

September 1977 ACM Computing Surveys (CSUR), Volume 9 issue 3 

Full text available: pdf(2.58 MB) Additional Information: full citation, references, citings, index terms



13 Architecture-level power estimation and design experiments
Rita Yu Chen, Mary Jane Irwin, Raminder S. Bajwa 

January 2001 ACM Transactions on Design Automation of Electronic Systems 
(TODAES), Volume 6 Issue 1 

Full text available: pdf(108.08 KB) Additional Information: full citation, abstract, references, citings, index terms

Architecture-level power estimation has received more attention recently because of its 
efficiency. This article presents a technique used to do power analysis of processors at the 
architecture level. It provides cycle-by-cycle power consumption data of the architecture on 
the basis of the instruction/data flow stream. To characterize the power dissipation of 
control units, a novel hierarchical method has been developed. Using this technique, a 
power estimator is implemented for a commercia ... 

Keywords: architecture tradeoff, architecture-level power estimation, computer-aided 
design of VLSI, control unit, energy model, energy table, functional unit, hardware/software 
codesign, instruction format transition, low power design, output signal transition, power 
analysis and estimation, switch capacitance 



14 Macros as multi-stage computations: type-safe, generative, binding macros in
MacroML

Steven E. Ganz, Amr Sabry, Walid Taha 

October 2001 ACM SIGPLAN Notices, Proceedings of the sixth ACM SIGPLAN
international conference on Functional programming, Volume 36 issue 10

Full text available: pdf(233.27 KB) Additional Information: full citation, abstract, references, citings, index terms

With few exceptions, macros have traditionally been viewed as operations on syntax trees 
or even on plain strings. This view makes macros seem ad hoc, and is at odds with two 
desirable features of contemporary typed functional languages: static typing and static 
scoping. At a deeper level, there is a need for a simple, usable semantics for macros. This 
paper argues that these problems can be addressed by formally viewing macros as multi- 
stage computations. This view eliminates the need for fresh ... 

15 Retargetable compilation for low power 
Wen-Tsong Shiue 

April 2001 Proceedings of the ninth international symposium on Hardware/software 
codesign 

Full text available: pdf(469.20 KB) Additional Information: full citation, abstract, references, citings, index terms

Most research to date on energy minimization in DSP processors has focused on hardware
solutions. This paper examines the software-based factors affecting performance and energy
consumption for architecture-aware compilation. In this paper, we focus on providing 
support for one architectural feature of DSPs that makes code generation difficult, namely 
the use of multiple data memory banks. This feature increases memory bandwidth by 
permitting multiple data memory accesses to occur in parallel ... 

Keywords: architecture-aware compiler design, high performance and low power design, 
instruction scheduling, register allocation 



http://portal.acm.org/results.cfm?coll-ACM&dl=ACM&CFID=23 



6/28/04 



Results (page 1): two stage MAC instruction 






16 Design methodologies for ASIPs: Introduction of local memory elements in instruction
set extensions

Partha Biswas, Vinay Choudhary, Kubilay Atasu, Laura Pozzi, Paolo Ienne, Nikil Dutt 
June 2004 Proceedings of the 41st annual conference on Design automation 

Full text available: pdf(250.48 KB) Additional Information: full citation, abstract, references, index terms

Automatic generation of Instruction Set Extensions (ISEs), to be executed on a custom 
processing unit or a coprocessor is an important step towards processor customization. A 
typical goal of a manual designer is to combine a large number of atomic instructions into 
an ISE satisfying microarchitectural constraints. However, memory operations pose a 
challenge for previous ISE approaches by limiting the size of the resulting instruction. In 
this paper, we introduce memory elements into custo ... 

Keywords: ASIPs, ad-hoc functional units, coprocessors, customizable processors, genetic 
algorithm, instruction set extensions 



17 Code generation for a DSP processor
Wei Kai Cheng, Youn Long Lin 

May 1994 Proceedings of the 7th international symposium on High-level synthesis

Full text available: pdf(594.45 KB) Additional Information: full citation, references, citings



18 Algorithm and architecture of a 1 V low power hearing instrument DSP
Finn Møller, Nikolai Bisgaard, John Melanson

August 1999 Proceedings of the 1999 international symposium on Low power 
electronics and design 

Full text available: pdf(570.47 KB) Additional Information: full citation, references, index terms



19 Measuring experimental error in microprocessor simulation
Rajagopalan Desikan, Doug Burger, Stephen W. Keckler 

May 2001 ACM SIGSOFT Software Engineering Notes, Proceedings of the 2001
symposium on Software reusability: putting software reuse in context,
Volume 26 Issue 3

Full text available: pdf(1.03 MB) Additional Information: full citation, references, index terms



20 Instruction generation for hybrid reconfigurable systems

R. Kastner, A. Kaplan, S. Ogrenci Memik, E. Bozorgzadeh 

October 2002 ACM Transactions on Design Automation of Electronic Systems 
(TODAES), Volume 7 Issue 4 

Full text available: pdf(538.25 KB) Additional Information: full citation, abstract, references, citings, index terms

Future computing systems need to balance flexibility, specialization, and performance in 
order to meet market demands and the computing power required by new applications. 
Instruction generation is a vital component for determining these trade-offs. In this work, 
we present theory and an algorithm for instruction generation. The algorithm profiles a 
dataflow graph and iteratively contracts edges to create the templates. We discuss how to 
target the algorithm toward the novel problem of instructi ... 

Keywords: FPGA, high-level synthesis, reconfigurable computing 